Speaker dependent mapping for low bit rate coding of throat microphone speech
Abstract
Throat microphones (TM), which are robust to background noise, can be used in environments with high levels of background noise. However, speech collected using a TM is perceptually less natural. The objective of this paper is to map the spectral features (represented as cepstral features) of TM speech to those of close speaking microphone (CSM) speech in order to improve the perceptual quality of TM speech, and to represent it efficiently for coding. The spectral mapping from TM to CSM speech is done using a multilayer feed-forward neural network trained on features derived from TM and CSM speech. The sequence of estimated CSM spectral features is quantized and coded as a sequence of codebook indices using vector quantization. The sequence of codebook indices, together with the pitch contour and the energy contour derived from the TM signal, is used to store or transmit the TM speech information efficiently. At the receiver, the all-pole system corresponding to the estimated CSM spectral vectors is excited by a synthetic residual to generate the speech signal.
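The vector quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook size (256), the feature dimension (12), the frame rate (100 frames per second), and the function names `vq_encode`, `vq_decode`, and `spectral_bit_rate` are all assumptions chosen for illustration, and the codebook here is random rather than trained on estimated CSM features.

```python
import math
import numpy as np


def vq_encode(features, codebook):
    """Return, for each feature vector, the index of its nearest codeword."""
    # Squared Euclidean distance between every frame and every codeword.
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)


def vq_decode(indices, codebook):
    """Recover the quantized feature vectors from codebook indices."""
    return codebook[indices]


def spectral_bit_rate(codebook_size, frames_per_second):
    """Bits per second needed to send one codebook index per frame."""
    bits_per_index = math.ceil(math.log2(codebook_size))
    return bits_per_index * frames_per_second


# Hypothetical setup: 256 codewords of 12 cepstral coefficients each.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 12))

# Three frames lying very close to codewords 3, 200 and 17.
frames = codebook[[3, 200, 17]] + 0.001 * rng.normal(size=(3, 12))

indices = vq_encode(frames, codebook)
quantized = vq_decode(indices, codebook)
rate = spectral_bit_rate(codebook_size=256, frames_per_second=100)
```

Under these assumed values, each frame costs 8 bits for the spectral information, so the index stream alone needs 800 bits per second. The pitch and energy contours mentioned in the abstract would add to this figure.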
Similar resources
Mapping Speech Spectra from Throat Microphone to Close-Speaking Microphone: A Neural Network Approach
Speech recorded from a throat microphone is robust to the surrounding noise, but sounds unnatural, unlike the speech recorded from a close-speaking microphone. This paper addresses the issue of improving the perceptual quality of the throat microphone speech by mapping the speech spectra from the throat microphone to the close-speaking microphone. A neural network model is used to capture the sp...
Compensation of Chann Spectrum Freq
Line Spectrum Frequencies (LSFs) are an effective and efficient representation for low bit-rate (LBR) speech coding. It is also appealing to use LSFs for speech or speaker recognition within a digital-communication-based system. However, the channel effect on LSFs degrades the recognition performance. This paper attempts to treat the problem of channel effect in LSF domain so that the recognition...
Speaker-dependent mapping of source and system features for enhancement of throat microphone speech
A throat microphone (TM) produces speech which is perceptually poorer than that produced by a close speaking microphone (CSM). Many attempts at improving the quality of TM speech have been made by mapping the features corresponding to the vocal tract system. These techniques are limited by the methods used to generate the excitation signal. In this paper a method to map the source (excit...
A new statistical excitation mapping for enhancement of throat microphone recordings
In this paper we investigate a new statistical excitation mapping technique to enhance throat-microphone speech using joint analysis of throat- and acoustic-microphone recordings. In a recent study we employed source-filter decomposition to enhance the spectral envelope of the throat-microphone recordings. In the source-filter decomposition framework we observed that the spectral envelope difference ...
Throat microphone signal for speaker recognition
Speaker recognition systems perform better when clean speech signals are used for the task. In the presence of high levels of background noise, speech recorded from a close speaking microphone will be degraded, and hence so will the performance of the speaker recognition system. Use of a transducer held at the throat results in a signal that is clean even in a noisy environment. This paper discusses the...
Publication date: 2009